Abstract



We discuss a characterization of the centered Gaussian distribution which can be read from results of Archimedes and Maxwell, and relate it to Charles Stein's well-known characterization of the same distribution. These characterizations fit into a more general framework involving the beta-gamma algebra, which explains some other characterizations appearing in the Stein's method literature.

1 CHARACTERIZING THE GAUSSIAN DISTRIBUTION 

One of Archimedes' proudest accomplishments was a proof that the surface area of a sphere is equal to the surface area of the tube of the smallest cylinder containing it; see Figure 1. Legend has it that he was so pleased with this result that he arranged to have an image similar to Figure 1 inscribed on his tomb.



Figure 1: An illustration of the inscription on Archimedes' tomb. 
More precisely, in the work "On the Sphere and Cylinder, Book I" as translated on Page 1 of [2], Archimedes states that for every plane perpendicular to the axis of the tube of the cylinder, the surface areas lying above the plane on the sphere and on the tube are equal. See Figure 2 for illustration and also the discussion around Corollary 7 of [1].




Figure 2: The surface area of the shaded "cap" of the sphere above a plane 
is equal to the striped surface area on the tube of the cylinder above the 
same plane. 

In probabilistic terms, if a point is picked uniformly at random according to surface area on the unit sphere in three dimensions, then its projection onto any given axis having origin at the center of the sphere is uniformly distributed on the interval $(-1, 1)$, independent of the angular component in the plane perpendicular to that axis. Formally, we have the following result.

Proposition 1.1. If $V$ is uniformly distributed on the interval $(-1, 1)$ and $\Theta$ is uniformly distributed on the interval $(0, 2\pi)$ and is independent of $V$, then

$$\left(V, \sqrt{1 - V^2}\cos(\Theta), \sqrt{1 - V^2}\sin(\Theta)\right)$$

is uniformly distributed on the surface of the two dimensional sphere of radius one.
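As an informal numerical illustration of Proposition 1.1 (not part of the original argument), the following Python sketch draws points by the recipe above and checks that the projection onto each of the three coordinate axes puts mass $(1 - h)/2$ above height $h$, as uniformity on the sphere requires. The function names, seed, and sample size are illustrative assumptions.

```python
import math
import random

def sphere_point_from_cylinder(rng):
    """Map a uniform point on the cylinder tube to the unit sphere (Proposition 1.1)."""
    v = rng.uniform(-1.0, 1.0)               # height along the axis
    theta = rng.uniform(0.0, 2.0 * math.pi)  # angle around the axis
    s = math.sqrt(1.0 - v * v)
    return (v, s * math.cos(theta), s * math.sin(theta))

def cap_fraction(points, axis, height):
    """Fraction of points whose coordinate along `axis` exceeds `height`."""
    return sum(1 for p in points if p[axis] > height) / len(points)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
points = [sphere_point_from_cylinder(rng) for _ in range(200_000)]

# If the points are uniform on the sphere, the projection on every axis is
# uniform on (-1, 1), so the cap above height 0.5 should carry mass 0.25.
fractions = [cap_fraction(points, axis, 0.5) for axis in range(3)]
print(fractions)
```

Only the first coordinate is uniform by construction; agreement on the other two axes is what reflects the rotation invariance.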

In this article, we take Proposition 1.1 as a starting point for a discussion of characterizations of the centered Gaussian distribution which arise in Stein's method of distributional approximation. This discussion culminates in Theorem 1.6 at the end of this section. We then generalize some of these results in Section 2 to obtain the characterization of the gamma distribution found in Proposition 2.1 and also mention an analog of Theorem 1.6 for the exponential distribution. We conclude in Section 3 with a discussion of some related literature.

To move from Archimedes' result above to characterizing the Gaussian 
distribution, we state the following result which was first realized by the 
astronomer Herschel and made well known by the physicist Maxwell in his 
study of the velocities of a large number of gas particles in a container; see 
the introduction of [6]. 

Proposition 1.2. Let $X = (X_1, X_2, X_3)$ be a vector of independent and identically distributed (i.i.d.) random variables. Then $X_1$ has a mean zero Gaussian distribution if and only if for all rotations $R : \mathbb{R}^3 \to \mathbb{R}^3$, $RX$ has the same distribution as $X$.

Propositions 1.1 and 1.2 are related by the following observations. It is clear that if $X$ is an $\mathbb{R}^3 \setminus \{0\}$ valued random vector such that $RX$ has the same distribution as $X$ for all rotations $R$, then $X/\|X\|$ has a rotation invariant distribution on the surface of the two dimensional unit sphere and is independent of $\|X\| := \sqrt{X_1^2 + X_2^2 + X_3^2}$. Since the unique rotation invariant distribution on the surface of a sphere of any dimension is the uniform distribution (Theorem 4.1.2 of [6]), the propositions of Archimedes and Herschel-Maxwell suggest the following characterization of mean zero Gaussian distributions; we provide a proof and discussion of generalizations in the Appendix.

Proposition 1.3. Let $X = (X_1, X_2, X_3)$ be a vector of i.i.d. random variables. Then $X_1$ has a mean zero Gaussian distribution if and only if for $V$ uniform on $(-1, 1)$ and independent of $X$,

$$X_1 \stackrel{d}{=} V\sqrt{X_1^2 + X_2^2 + X_3^2}.$$



Here and in what follows, $\stackrel{d}{=}$ denotes equality in distribution of two random variables. The distribution of $\sqrt{X_1^2 + X_2^2 + X_3^2}$, where $X_1, X_2, X_3$ are independent standard normal variables, is referred to as the Maxwell or Maxwell-Boltzmann distribution; see page 453 of



Proposition 1.3 characterizes centered Gaussian distributions as the one parameter scale family of fixed points of the distributional transformation which takes the distribution of a random variable $X$ to the distribution of $V\sqrt{X_1^2 + X_2^2 + X_3^2}$, where $X_1, X_2, X_3$ are i.i.d. copies of $X$, and $V$ is uniform on $(-1, 1)$ independent of $(X_1, X_2, X_3)$. Such characterizations of distributions as the unique fixed point of a transformation are often used in Stein's method for distributional approximation (see [20] for an introduction). In the case of the Gaussian distribution, these transformations are put to use through Stein's Lemma.
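The fixed point property can be checked informally by simulation. The sketch below (an illustration under our own choice of names, seed, and sample size, not a proof) applies the transformation once to standard normal inputs and compares the second and fourth moments of the output with the standard normal values 1 and 3.

```python
import math
import random

def transform_sample(rng):
    """One draw of V * sqrt(X1^2 + X2^2 + X3^2) with X_i i.i.d. standard normal."""
    v = rng.uniform(-1.0, 1.0)
    radius = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(3)))
    return v * radius

rng = random.Random(1)  # arbitrary fixed seed
n = 200_000
draws = [transform_sample(rng) for _ in range(n)]

# A standard normal variable has second moment 1 and fourth moment 3.
second_moment = sum(x ** 2 for x in draws) / n
fourth_moment = sum(x ** 4 for x in draws) / n
print(second_moment, fourth_moment)
```

Matching two moments is of course only a sanity check; the proposition asserts equality of the full distributions.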

Lemma 1.4 (Stein's Lemma). [21] A random variable $W$ has the mean zero, variance one Gaussian distribution if and only if for all absolutely continuous functions $f$ with bounded derivative,

$$\mathbb{E} f'(W) = \mathbb{E} W f(W).$$
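To see the identity at work numerically, the following sketch estimates $\mathbb{E} f'(W) - \mathbb{E} W f(W)$ for the test function $f = \sin$, once for a standard normal $W$ and once for a uniform variable with the same mean and variance. The choice of test function, the comparison distribution, and all names are our own assumptions for illustration.

```python
import math
import random

def stein_gap(draws):
    """Monte Carlo estimate of E f'(W) - E W f(W) for f = sin, so f' = cos."""
    return sum(math.cos(w) - w * math.sin(w) for w in draws) / len(draws)

rng = random.Random(5)  # arbitrary fixed seed
n = 200_000
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Uniform on (-sqrt(3), sqrt(3)) also has mean zero and variance one,
# but it is not Gaussian, so the identity should fail for it.
a = math.sqrt(3.0)
uniform_draws = [rng.uniform(-a, a) for _ in range(n)]

normal_gap = stein_gap(normal_draws)    # should be close to 0
uniform_gap = stein_gap(uniform_draws)  # exact value is cos(sqrt(3)), about -0.16
print(normal_gap, uniform_gap)
```

For the uniform case a direct integration gives $\mathbb{E}\cos(W) - \mathbb{E}W\sin(W) = \cos(\sqrt{3})$, which is bounded away from zero.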



We can relate the characterizations provided by Proposition 1.3 and Lemma 1.4, but first we need the following definition.



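Definition 1.1. Let $X$ be a random variable with distribution function $F$ and such that $\mu_\alpha := \mathbb{E}|X|^\alpha < \infty$. We define $F^{(\alpha)}$, the $\alpha$-power bias distribution of $F$, by the relation

$$dF^{(\alpha)}(x) = \frac{|x|^\alpha \, dF(x)}{\mu_\alpha},$$

and we write $X^{(\alpha)}$ for a random variable having this distribution. Otherwise put, $X^{(\alpha)}$ has the $\alpha$-power bias distribution of $X$ if and only if for every measurable function $f$ such that $\mathbb{E}|X|^\alpha |f(X)| < \infty$,

$$\mathbb{E} f(X^{(\alpha)}) = \frac{\mathbb{E}|X|^\alpha f(X)}{\mathbb{E}|X|^\alpha}. \tag{1.1}$$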

Taking $\alpha = 1$ and $X \geq 0$, $X^{(1)}$ has the size-biased distribution of $X$, a notion which frequently arises in probability theory and applications [5]. We can now state and prove the following result which sheds some light on the relationship between Proposition 1.3 and Lemma 1.4.



Lemma 1.5. If $W$ is a random variable with finite second moment and $f$ is an absolutely continuous function with bounded derivative, then for $V$ uniform on the interval $(-1, 1)$ and independent of $W^{(2)}$,

$$2\,\mathbb{E}W^2\,\mathbb{E} f'(VW^{(2)}) = \mathbb{E}Wf(W) - \mathbb{E}Wf(-W). \tag{1.2}$$

Proof. The lemma is implied by the following calculation

$$\mathbb{E} f'(VW^{(2)}) = \frac{1}{2}\,\mathbb{E}\left[\int_{-1}^{1} f'(uW^{(2)})\,du\right] = \mathbb{E}\left[\frac{f(W^{(2)}) - f(-W^{(2)})}{2W^{(2)}}\right] = \frac{\mathbb{E}Wf(W) - \mathbb{E}Wf(-W)}{2\,\mathbb{E}W^2},$$

where in the final equality we use (1.1). □



We now have the following main result for the Gaussian distribution. 

Theorem 1.6. Let $W$ be a random variable with finite second moment. The following are equivalent:

1. $W$ has the standard normal distribution.

2. For all absolutely continuous functions $f$ with bounded derivative,

$$\mathbb{E} f'(W) = \mathbb{E} W f(W). \tag{1.3}$$

3. $\mathbb{E}W^2 = 1$ and $W \stackrel{d}{=} VW^{(2)}$, where $V$ is uniform on $(-1, 1)$ and independent of $W^{(2)}$.

Proof. The equivalence of the first two items of the theorem is (Stein's) Lemma 1.4 above.

The fact that Item 1 implies Item 3 follows from Proposition 1.3 above coupled with the simple fact that for $X_1, X_2, X_3$ i.i.d. standard normal random variables, the density of $(X_1^2 + X_2^2 + X_3^2)^{1/2}$ is proportional to $x^2 e^{-x^2/2}$ (that is, $(X_1^2 + X_2^2 + X_3^2)^{1/2}$ has the same distribution as $|X_1^{(2)}|$).

Finally, we show Item 2 follows from Item 3. If $W \stackrel{d}{=} VW^{(2)}$ and $\mathbb{E}W^2 = 1$, then using Lemma 1.5 we find that for functions $f$ with bounded derivative,

$$\mathbb{E} f'(W) = \mathbb{E} f'(VW^{(2)}) = \frac{1}{2}\left(\mathbb{E}Wf(W) - \mathbb{E}Wf(-W)\right) = \mathbb{E}Wf(W),$$

where the last equality follows from the assumptions of Item 3 which imply $W$ has the same distribution as $-W$. □

Remark 1.2. The equivalence of Items 1 and 3 is essentially the content of Proposition 2.3 of [8], which uses the concept of the "zero-bias" transformation of Stein's method, first introduced in [11]. For a random variable $W$ with mean zero and variance $\sigma^2 < \infty$, we say that $W^*$ has the zero-bias distribution of $W$ if for all $f$ with $\mathbb{E}|Wf(W)| < \infty$,

$$\sigma^2\,\mathbb{E} f'(W^*) = \mathbb{E}Wf(W).$$

We think of the zero-bias transformation as acting on probability distributions with zero mean and finite variance, and Stein's Lemma implies that this transformation has the centered Gaussian distribution as its unique fixed point. Proposition 2.3 of [8] states that for a random variable $W$ with support symmetric about zero with unit variance, the transformation $W \to VW^{(2)}$ provides a representation of the zero-bias transformation. The equivalence of Items 1 and 3 of the theorem follows easily from these results.



2 BETA-GAMMA ALGEBRA 



The equivalence between Items 1 and 3 in Theorem 1.6 can be generalized as follows. For $r, s > 0$, let $G_r$ and $B_{r,s}$ denote standard gamma and beta random variables having respective densities $\frac{1}{\Gamma(r)} x^{r-1} e^{-x}$, $x > 0$, and $\frac{\Gamma(r+s)}{\Gamma(r)\Gamma(s)} y^{r-1}(1-y)^{s-1}$, $0 < y < 1$, where $\Gamma$ denotes the gamma function.



Proposition 2.1. Fix $p, r, s > 0$. A non-negative random variable $W$ has the distribution of $cG_r^p$ for some constant $c > 0$ if and only if $W \stackrel{d}{=} B_{r,s}^p W^{(s/p)}$, where $B_{r,s}$ is independent of $W^{(s/p)}$.



Remark 2.1. The equivalence in Items 1 and 3 of Theorem 1.6 follows by taking $p = r = 1/2$, $s = 1$ in Proposition 2.1 and using the well known fact that for $Z$ having the standard normal distribution, $Z^2 \stackrel{d}{=} 2G_{1/2}$.
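The forward direction of Proposition 2.1 can be sanity-checked by simulation in the special case $p = r = 1/2$, $s = 1$ of Remark 2.1. The sketch below relies on the fact that the $(s/p)$-power bias of $G_r^p$ is distributed as $G_{r+s}^p$, and compares empirical moments of $B_{r,s}^p G_{r+s}^p$ with the exact moments $\Gamma(r + pk)/\Gamma(r)$ of $G_r^p$. Function names, the seed, and the sample size are illustrative assumptions.

```python
import math
import random

def product_side(rng, p, r, s):
    """Draw B_{r,s}^p * (G_{r+s})^p, using that (G_r^p)^{(s/p)} is distributed as (G_{r+s})^p."""
    b = rng.betavariate(r, s)
    g = rng.gammavariate(r + s, 1.0)  # shape r + s, scale 1
    return (b * g) ** p

def gamma_power_moment(r, p, k):
    """Exact k-th moment of G_r^p, namely Gamma(r + p k) / Gamma(r)."""
    return math.gamma(r + p * k) / math.gamma(r)

rng = random.Random(2)  # arbitrary fixed seed
p, r, s = 0.5, 0.5, 1.0
n = 200_000
draws = [product_side(rng, p, r, s) for _ in range(n)]

empirical = [sum(x ** k for x in draws) / n for k in (1, 2, 3)]
exact = [gamma_power_moment(r, p, k) for k in (1, 2, 3)]
print(empirical)
print(exact)
```

Other positive values of `p`, `r`, and `s` can be substituted to probe the general statement in the same way.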



The proof of Proposition 2.1 uses the following result. 

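Lemma 2.2. Let $\alpha, \beta > 0$. If $X \geq 0$ is a random variable such that $\mathbb{E}X^\alpha < \infty$, then

$$(X^{(\alpha)})^\beta \stackrel{d}{=} (X^\beta)^{(\alpha/\beta)}.$$

Proof. By the definition of $\alpha/\beta$-power biasing, we only need to show that

$$\mathbb{E}X^\alpha\,\mathbb{E} f((X^{(\alpha)})^\beta) = \mathbb{E}X^\alpha f(X^\beta) \tag{2.1}$$

for all $f$ such that the expectation on the left hand side exists. By the definition of $\alpha$-power biasing, we have that for $g(t) = f(t^\beta)$,

$$\mathbb{E}X^\alpha\,\mathbb{E} g(X^{(\alpha)}) = \mathbb{E}X^\alpha g(X),$$

which is (2.1). □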
Proof of Proposition 2.1. The usual beta-gamma algebra (see [9]) implies that $G_r \stackrel{d}{=} B_{r,s}G_{r+s}$, where $B_{r,s}$ and $G_{r+s}$ are independent. Using the elementary fact that $G_{r+s} \stackrel{d}{=} G_r^{(s)}$, we find that for fixed $r, s > 0$, $G_r$ satisfies $G_r \stackrel{d}{=} B_{r,s}G_r^{(s)}$. Now applying Lemma 2.2 to $G_r$ with $\alpha = s$ and $\beta = p$, we have that $W = G_r^p$ satisfies $W \stackrel{d}{=} B_{r,s}^p W^{(s/p)}$, and the forward implication now follows after noting that $(cX)^{(\alpha)} \stackrel{d}{=} cX^{(\alpha)}$.

Now, assume that $W \stackrel{d}{=} B_{r,s}^p W^{(s/p)}$ for fixed $p, r, s > 0$ and we show that $W \stackrel{d}{=} cG_r^p$ for some $c > 0$. First, note by Lemma 2.2 that if $X = W^{1/p}$, then

$$X \stackrel{d}{=} B_{r,s}X^{(s)} \tag{2.2}$$

and we will be done if this implies that $X \stackrel{d}{=} cG_r$ for some $c > 0$. Note that by writing $X^{(s)}$, we have been tacitly assuming that $\mathbb{E}W^{s/p} = \mathbb{E}X^s < \infty$, which implies that $\mathbb{E}(B_{r,s}X^{(s)})^s < \infty$ so that using the definition of power biasing yields $\mathbb{E}X^{2s} < \infty$. Continuing in this way we find that $\mathbb{E}X^{ks} < \infty$ for all $k = 1, 2, \ldots$ and thus that $\mathbb{E}X^p < \infty$ for all $p \geq s$. Moreover, writing $a_k := \mathbb{E}X^{ks}$, and taking expectations in (2.2) after raising both sides to the power $ks$, we have

$$a_k = \mathbb{E}B_{r,s}^{ks}\,\frac{a_{k+1}}{a_1},$$

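where we have again used the definition of power biasing. We can solve this recursion after noting that for $\alpha > -r$,

$$\mathbb{E}B_{r,s}^\alpha = \frac{\Gamma(r+\alpha)\Gamma(r+s)}{\Gamma(r+\alpha+s)\Gamma(r)},$$

to find that for $k = 0, 1, \ldots$,

$$a_k = \left(\frac{a_1\Gamma(r)}{\Gamma(r+s)}\right)^k \frac{\Gamma(r+sk)}{\Gamma(r)}.$$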
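The closed form can be checked against the recursion $a_k a_1 = \mathbb{E}B_{r,s}^{ks}\,a_{k+1}$ numerically. The sketch below does this for one arbitrary choice of $a_1$, $r$, $s$, working with log-gamma values to avoid overflow; the function names and parameter values are our own illustrative assumptions.

```python
import math

def beta_moment(r, s, alpha):
    """E[B_{r,s}^alpha] = Gamma(r+alpha) Gamma(r+s) / (Gamma(r+alpha+s) Gamma(r))."""
    return math.exp(math.lgamma(r + alpha) + math.lgamma(r + s)
                    - math.lgamma(r + alpha + s) - math.lgamma(r))

def closed_form(k, a1, r, s):
    """a_k = (a1 Gamma(r) / Gamma(r+s))^k * Gamma(r + s k) / Gamma(r)."""
    log_ratio = math.log(a1) + math.lgamma(r) - math.lgamma(r + s)
    return math.exp(k * log_ratio + math.lgamma(r + s * k) - math.lgamma(r))

a1, r, s = 2.5, 0.7, 1.3  # arbitrary illustrative values

# Largest relative error in a_k * a_1 = E[B^{ks}] * a_{k+1} over k = 0, ..., 10.
max_rel_err = max(
    abs(closed_form(k, a1, r, s) * a1
        - beta_moment(r, s, k * s) * closed_form(k + 1, a1, r, s))
    / (closed_form(k, a1, r, s) * a1)
    for k in range(11)
)
print(max_rel_err)
```

The value printed should be at the level of floating point rounding error, and `closed_form(1, a1, r, s)` returns `a1` as required.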



For any value of $a_1 > 0$, it is easy to see using Stirling's formula that the sequence $(a_k)_{k \geq 1}$ satisfies Carleman's condition

$$\sum_{k=1}^{n} a_{2k}^{-1/2k} \to \infty, \quad \text{as } n \to \infty,$$

so that for a given value of $a_1$, there is exactly one probability distribution having moment sequence $(a_k)_{k \geq 1}$ (see the remark following Theorem (3.11)
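in Chapter 2 of [13]). Finally, it is easy to see that the random variable $(cG_r)^s$ with $c^s = a_1\Gamma(r)/\Gamma(r+s)$ has moment sequence $(a_k)_{k \geq 1}$, so that $X^s \stackrel{d}{=} (cG_r)^s$ and hence $X \stackrel{d}{=} cG_r$. □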



2.1 Exponential Distribution 

The exponential distribution has many characterizing properties, many of which stem from its relation to Poisson processes. For example, by superimposing two independent Poisson processes into one, we easily find that if $Z_1$ and $Z_2$ are independent rate one exponential variables, then $2\min\{Z_1, Z_2\}$ is also a rate one exponential (this is in fact characterizing as shown in Theorem 3.4.1 of [6]).

For our framework above, we use the memoryless property of the exponential distribution in the context of renewal theory. In greater detail, for any non-negative random variable $X$, we define the renewal sequence generated from $X$ as $(S_1, S_2, \ldots)$, where $S_i = \sum_{k=1}^{i} X_k$ and the $X_k$ are i.i.d. copies of $X$. For a fixed $t > 0$, the distribution of the length of the interval $[S_{K_t}, S_{K_t+1}]$ containing $t$ and the position of $t$ in this interval depend on $t$ and the distribution of $X$ in some rather complicated way. We can remove this dependence on $t$ by starting the sequence in "stationary," meaning that we look instead at the sequence $(X', X' + S_1, \ldots)$, where $X'$ has the limiting distribution of $S_{K_t+1} - t$ as $t$ goes to infinity; see Chapter 5, Sections 6 and 7.b of [13].

If $X$ has a continuous distribution with finite mean, then the distribution of $X'$ is the size-biased distribution of $X$ times an independent variable which is uniform on $(0, 1)$ [13]. Heuristically, the memoryless property which characterizes the exponential distribution (Chapter 12 of [1]) implies that the renewal sequence generated by an exponential distribution is stationary (that is, $X$ and $X'$ have the same distribution) and vice versa. The following result implies this intuition is correct.

Theorem 2.3. [16] Let $W$ be a non-negative random variable with finite mean. The following are equivalent:

1. $W$ has the exponential distribution with mean one.

2. For all absolutely continuous functions $f$ with bounded derivative,

$$\mathbb{E} f'(W) = \mathbb{E} f(W) - f(0).$$

3. $\mathbb{E}W = 1$ and $W \stackrel{d}{=} UW^{(1)}$, where $U$ is uniform on $(0, 1)$ and independent of $W^{(1)}$.

Similar to the case of the normal distribution, the crucial link between Items 2 and 3 of Theorem 2.3 is provided by the following lemma; the proof is similar to that of Lemma 1.5.

Lemma 2.4. If $W$ is a non-negative random variable with finite mean and $f$ is an absolutely continuous function with bounded derivative, then

$$\mathbb{E}W\,\mathbb{E} f'(UW^{(1)}) = \mathbb{E} f(W) - f(0).$$



Proof of Theorem 2.3. The equivalence of Items 1 and 3 is a special case of Proposition 2.1 with $r = s = p = 1$, and the equivalence of Items 2 and 3 can be read from Lemma 2.4 (note in particular that Item 2 with $f(x) = x$ implies that $\mathbb{E}W = 1$). □
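Item 3 can be illustrated by simulating the map $W \to UW^{(1)}$ for two mean one distributions. For $W$ exponential with mean one, the size-biased variable $W^{(1)}$ has the gamma distribution with shape 2, and $UW^{(1)}$ should again be exponential; for $W$ uniform on $(0, 2)$, $W^{(1)}$ has density $x/2$ on $(0, 2)$ and $UW^{(1)}$ has density $1 - x/2$, which is not uniform. The names, seed, and the comparison via the tail probability at 1 are our own illustrative choices.

```python
import math
import random

def equilibrium_of_exponential(rng):
    """Draw U * W^(1) for W ~ Exp(1); the size-biased W^(1) is Gamma(shape 2, scale 1)."""
    return rng.random() * rng.gammavariate(2.0, 1.0)

def equilibrium_of_uniform(rng):
    """Draw U * W^(1) for W uniform on (0, 2); W^(1) has density x/2, sampled as 2*sqrt(U')."""
    return rng.random() * 2.0 * math.sqrt(rng.random())

rng = random.Random(3)  # arbitrary fixed seed
n = 200_000
exp_draws = [equilibrium_of_exponential(rng) for _ in range(n)]
unif_draws = [equilibrium_of_uniform(rng) for _ in range(n)]

# P(W > 1) is exp(-1) for Exp(1) and 1/2 for Uniform(0, 2).
exp_tail = sum(1 for x in exp_draws if x > 1.0) / n    # close to exp(-1): fixed point
unif_tail = sum(1 for x in unif_draws if x > 1.0) / n  # close to 1/4, not 1/2
print(exp_tail, math.exp(-1.0), unif_tail)
```

The exponential tail probability is reproduced, while the uniform input is visibly moved by the transformation, in line with the uniqueness of the fixed point.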

Remark 2.2. For a non-negative random variable $W$ with finite mean, the transformation $W \to UW^{(1)}$ is referred to in the Stein's method literature as the "equilibrium" transformation, first defined in this context in [16], where Theorem 2.3 is also shown.

Due to the close relationship between the exponential and geometric distributions, it is not surprising that there is a discrete analog of Theorem 2.3 with the exponential distribution replaced by the geometric; see [17] for this discussion in the context of Stein's method.

3 PROOF OF PROPOSITION 1.3 AND DISCUSSION



Proof of Proposition 1.3. We will show that for $n \geq 2$ and $Y_1, \ldots, Y_n$ non-negative i.i.d. random variables, $Y_1 \stackrel{d}{=} cG_{1/(n-1)}$ for some $c > 0$ if and only if

$$Y_1 \stackrel{d}{=} B_{1/(n-1),1}(Y_1 + \cdots + Y_n), \tag{3.1}$$

where $B_{1/(n-1),1}$ is independent of $(Y_1, \ldots, Y_n)$, and $G_a$, $B_{a,b}$ are gamma and beta variables as defined above. The proposition then follows from this fact with $n = 3$ after noting that $V^2 \stackrel{d}{=} B_{1/2,1}$ and if $X$ has a mean zero and variance one normal distribution, then $X^2 \stackrel{d}{=} 2G_{1/2}$.

The forward implication is a consequence of Proposition 2.1 coupled with the fact that $G_{a+b} \stackrel{d}{=} G_a + G_b$, where $G_a$ and $G_b$ are independent. To establish the result we assume (3.1) and show $Y_1 \stackrel{d}{=} cG_{1/(n-1)}$. Since we assume that $Y_1$ is non-negative, we define the Laplace transform

$$\varphi(\lambda) = \mathbb{E}e^{-\lambda Y_1}, \quad \lambda \geq 0.$$

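By conditioning on the value of $B_{1/(n-1),1}$ in (3.1), we find for $\lambda > 0$,

$$\varphi(\lambda) = \mathbb{E}\,\varphi(B_{1/(n-1),1}\lambda)^n = \frac{1}{n-1}\int_0^1 u^{-(n-2)/(n-1)}\varphi(u\lambda)^n\,du = \frac{1}{(n-1)\lambda^{1/(n-1)}}\int_0^\lambda t^{-(n-2)/(n-1)}\varphi(t)^n\,dt,$$

where we have made the change of variable $t = u\lambda$ in the last equality. We can differentiate the equation above with respect to $\lambda$ which yields

$$\varphi'(\lambda) = \frac{-1}{(n-1)^2\lambda^{n/(n-1)}}\int_0^\lambda t^{-(n-2)/(n-1)}\varphi(t)^n\,dt + \frac{\varphi(\lambda)^n}{(n-1)\lambda} = \frac{-\varphi(\lambda) + \varphi(\lambda)^n}{(n-1)\lambda}. \tag{3.2}$$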



Thus, we find that $\varphi$ satisfies the differential equation (3.2) with boundary condition $\varphi(0) = 1$.

By computing the derivative using (3.2) and using that $0 < \varphi(\lambda) \leq 1$ for $\lambda > 0$, we find that for some constant $c \geq 0$,

$$\frac{1 - \varphi(\lambda)^{n-1}}{\lambda\,\varphi(\lambda)^{n-1}} = c, \quad \lambda > 0.$$

Solving this equation for $\varphi(\lambda)$ implies

$$\varphi(\lambda) = (1 + c\lambda)^{-1/(n-1)},$$

which is the Laplace transform of $cG_{1/(n-1)}$, as desired. □
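As a quick numerical cross-check of the last step, the sketch below verifies that $\varphi(\lambda) = (1 + c\lambda)^{-1/(n-1)}$ satisfies the right-most form of (3.2), approximating $\varphi'$ by a central difference. The grid of values for $n$, $c$, $\lambda$ and the step size `h` are arbitrary illustrative choices.

```python
def phi(lam, c, n):
    """Candidate Laplace transform (1 + c*lam)^(-1/(n-1))."""
    return (1.0 + c * lam) ** (-1.0 / (n - 1))

def ode_residual(lam, c, n, h=1e-6):
    """phi'(lam) minus the right side of (3.2), with phi' from a central difference."""
    derivative = (phi(lam + h, c, n) - phi(lam - h, c, n)) / (2.0 * h)
    rhs = (-phi(lam, c, n) + phi(lam, c, n) ** n) / ((n - 1) * lam)
    return derivative - rhs

residuals = [abs(ode_residual(lam, c, n))
             for n in (2, 3, 5) for c in (0.5, 2.0) for lam in (0.1, 1.0, 4.0)]
max_residual = max(residuals)
print(max_residual)
```

The residual is limited only by the finite difference approximation, since analytically both sides equal $-c\,\varphi(\lambda)^n/(n-1)$.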



The proof of Proposition 1.3 and the beta-gamma algebra suggest the following conjecture.

Conjecture 3.1. Let $n \geq 2$ and $Y = (Y_1, Y_2, \ldots, Y_n)$ be a vector of i.i.d. random variables. Then $Y_1$ is equal in distribution to $cG_a$ for some constant $c > 0$ if and only if for $V = B_{a,(n-1)a}$ independent of $Y$,

$$Y_1 \stackrel{d}{=} V(Y_1 + Y_2 + \cdots + Y_n). \tag{3.3}$$
The forward implication of the conjecture is an easy consequence of the following beta-gamma algebra facts: for $G_a$, $G_b$, and $B_{a,b}$ independent, $B_{a,b}G_{a+b} \stackrel{d}{=} G_a$, and $G_a + G_b \stackrel{d}{=} G_{a+b}$.

Conversely, assuming (3.3), it is possible to follow the proof of Proposition 1.3, which leads to an integral equation for the Laplace transform of $Y_1$. It is easy to verify that the Laplace transform of the appropriate gamma distribution satisfies this equation, so it is only a matter of showing the integral equation has a unique scale family of solutions. In the case $a = 1/(n-1)$ the integral equation has a simpler form from which the required uniqueness follows from the proof of Proposition 1.3 above. In the general case, we do not have an argument for the uniqueness of the solution. However, under the assumption that $Y_1$ has all positive integer moments finite, the conjecture follows after using (3.3) to obtain a recursion relation for the moments which, up to the scale factor, determines those of a gamma distribution with the appropriate parameter.
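The forward implication of Conjecture 3.1 is easy to probe by simulation. The sketch below draws the right side of (3.3) with $Y_i$ distributed as $G_a$ and compares its first two moments with those of $G_a$, namely $a$ and $a(a+1)$. The parameter values, seed, and names are illustrative assumptions, and the check says nothing about the open converse direction.

```python
import random

def right_side_of_3_3(rng, a, n):
    """Draw V * (Y_1 + ... + Y_n) with Y_i i.i.d. G_a and V = B_{a,(n-1)a} independent."""
    total = sum(rng.gammavariate(a, 1.0) for _ in range(n))
    return rng.betavariate(a, (n - 1) * a) * total

rng = random.Random(4)  # arbitrary fixed seed
a, n, samples = 0.8, 4, 200_000
draws = [right_side_of_3_3(rng, a, n) for _ in range(samples)]

mean = sum(draws) / samples                   # G_a has mean a
second = sum(x * x for x in draws) / samples  # G_a has second moment a (a + 1)
print(mean, second)
```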



Conjecture 3.1 is very similar to Lukacs' characterization of the gamma distribution [14] that positive, non-degenerate, independent variables $X, Y$ have the gamma distribution if and only if $X + Y$ and $X/(X + Y)$ are independent. However, it does not appear that this result can be used to show the difficult implication of the conjecture. Note also that Lukacs' result characterizes beta distributions as the only distributions which can be written as $X/(X + Y)$ independent of $X + Y$ for positive, non-degenerate, independent variables $X, Y$. Thus, a question related to our conjecture is that if (3.3) holds for independent variables $Y_1, \ldots, Y_n$ and $V$, does this imply that $V$ has a beta distribution?

Conjecture 3.1 is connected to the observation of Poincaré (see the introduction of [15]) that the coordinates of a point uniformly chosen on the $(n-1)$ dimensional sphere of radius $\sqrt{n}$ are asymptotically distributed as independent standard Gaussians. Analogous to the discussion in the introduction, we can realize these uniformly distributed points as $\sqrt{n}R^{-1}(X_1, \ldots, X_n)$, where $X_1, \ldots, X_n$ are independent standard normal variables and $R = (X_1^2 + \cdots + X_n^2)^{1/2}$. Squaring these coordinates, Poincaré's result implies that $nX_1^2/(X_1^2 + \cdots + X_n^2)$ is asymptotically distributed as $X_1^2$. Since $X_1^2 \stackrel{d}{=} 2G_{1/2}$, taking the limit as $n \to \infty$ on the right side of (3.3) with $a = 1/2$ yields a related fact.

The forward implication of Proposition 1.3 is evidenced also by creation of a three-dimensional Bessel process by conditioning a one-dimensional Brownian motion not to hit zero. Indeed, a process version of Proposition 1.3 is involved in the proof of the "$2M - X$" theorem provided in [19]: see Section 2 and especially Section 2.3 of [7]. More generally, process analogs of the beta-gamma algebra can be found in Section 3 of [7].

Some extensions of the characterizations discussed in this article to more complicated distributions can be found in the recent work [18].